Introduction

A quick play with the comorbidity package to see how Charlson and Elixhauser scores differ when calculated from full versus truncated ICD-10 codes.

Packages

library(comorbidity)
library(tidyverse)
## -- Attaching packages ---------------------- tidyverse 1.2.1 --
## v ggplot2 3.0.0     v purrr   0.2.5
## v tibble  1.4.2     v dplyr   0.7.6
## v tidyr   0.8.1     v stringr 1.3.1
## v readr   1.1.1     v forcats 0.3.0
## -- Conflicts ------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(magrittr)
## 
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
## 
##     set_names
## The following object is masked from 'package:tidyr':
## 
##     extract

Initialise dataframes

The package makes it really easy to generate dummy data. I’m going to create 30,000 observations for 5,000 individuals (each with a differing number of diagnoses), then make a copy of the dataset with the ICD-10 codes truncated to 3 characters.

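# Full data: 30,000 diagnosis rows spread at random over ids 1 to 5000
data_full_code <- data.frame(
  id = sample(1:5000, size = 30000, replace = TRUE),
  code = sample_diag(n = 30000, version = "ICD10_2011"),
  stringsAsFactors = FALSE
)

# Sort by id in place (%<>% is magrittr's compound assignment pipe)
data_full_code %<>%
  arrange(id)

# Truncated data: keep only the first 3 characters of each code
data_truncated <-
  data_full_code %>%
  mutate(code = strtrim(code, 3))

data_full_code
data_truncated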
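
To see what truncation does to individual codes, here is a minimal base R sketch. The three example codes and the two regular expressions are my own illustrative assumptions (a simplified take on how ICD-10 comorbidity definitions can depend on the fourth character). They are not the patterns the comorbidity package uses internally.

```r
# Hypothetical example codes (not drawn from the dummy data above)
codes <- c("I214", "I252", "E102")
truncated <- strtrim(codes, 3)

# Simplified, assumed patterns for two comorbidities
mi_pattern <- "^I21|^I22|^I252"                      # myocardial infarction
dm_comp_pattern <- "^E102|^E103|^E104|^E105|^E107"   # diabetes with complications

# Does each code match before and after truncation?
mi_full  <- grepl(mi_pattern, codes)
mi_trunc <- grepl(mi_pattern, truncated)
dm_full  <- grepl(dm_comp_pattern, codes)
dm_trunc <- grepl(dm_comp_pattern, truncated)

data.frame(codes, truncated, mi_full, mi_trunc, dm_full, dm_trunc)
```

Under these assumed patterns, a condition defined on the 3-character stem (I21) survives truncation, while one that needs the fourth character (I252, E102) no longer matches.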

Calculate Charlson Index

Now run the comorbidity() function to create Charlson scores for both datasets.

charlson_full <- comorbidity(data_full_code, id = "id", code = "code", score = "charlson_icd10")
charlson_trunc <- comorbidity(data_truncated, id = "id", code = "code", score = "charlson_icd10")
charlson_full
charlson_trunc

Are they the same?

identical(charlson_full, charlson_trunc)
## [1] FALSE

Of course not. But how different are they?
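
One rough way to put a number on this is to join the two results by id and look at the per-person difference in score. The sketch below uses two small made-up data frames (`full` and `trunc`) as stand-ins for charlson_full and charlson_trunc, and assumes only that each has an id and a score column.

```r
# Toy stand-ins for charlson_full and charlson_trunc (values are made up)
full  <- data.frame(id = 1:5, score = c(0, 1, 2, 3, 1))
trunc <- data.frame(id = 1:5, score = c(0, 1, 1, 2, 1))

# Join on id so each person's two scores sit side by side
both <- merge(full, trunc, by = "id", suffixes = c("_full", "_trunc"))
both$diff <- both$score_full - both$score_trunc

# Share of people whose score changed, and the distribution of the change
prop_changed <- mean(both$diff != 0)
prop_changed
table(both$diff)
```

With the real outputs, the same steps should work by swapping in charlson_full and charlson_trunc, provided both contain one row per id.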

Visualise

Scores for full data

charlson_full %>% 
  ggplot(aes(score)) +
  geom_bar()

…and truncated data. As expected, there are slightly fewer high scores and slightly more low/zero scores, but the difference is not huge.

charlson_trunc %>% 
  ggplot(aes(score)) +
  geom_bar()

The comorbidity() function also calculates a weighted score which increases scores for certain diseases (e.g. diabetes with complications). I’d expect to see a bigger difference here…

charlson_full %>% 
  ggplot(aes(wscore)) +
  geom_bar()

Truncated data…

charlson_trunc %>% 
  ggplot(aes(wscore)) +
  geom_bar()

This holds up surprisingly well…
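
For reference, the difference between the unweighted and weighted scores can be sketched as a plain sum of comorbidity flags versus a weighted sum. The flags below are made up, the column names are illustrative, and the weights are the classic Charlson weights as I understand them for these four conditions. Treat all of this as an assumption rather than as the package's internals.

```r
# Hypothetical 0/1 comorbidity flags for three people
flags <- data.frame(
  ami      = c(1, 0, 1),  # myocardial infarction
  diab     = c(1, 0, 0),  # diabetes without complications
  diabwc   = c(0, 1, 0),  # diabetes with complications
  metacanc = c(0, 0, 1)   # metastatic cancer
)

# Assumed weights for the four illustrative conditions
weights <- c(ami = 1, diab = 1, diabwc = 2, metacanc = 6)

# Unweighted score: number of comorbidities present
score <- unname(rowSums(flags))

# Weighted score: each flag multiplied by its weight, then summed per person
wscore <- as.vector(as.matrix(flags) %*% weights[names(flags)])

data.frame(score, wscore)
```

A person who loses a heavily weighted flag after truncation would drop further on wscore than on score, which is why I expected a bigger gap in the weighted plots.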

Elixhauser

Repeat for Elixhauser.

elix_full <- comorbidity(data_full_code, id = "id", code = "code", score = "elixhauser_icd10")
elix_trunc <- comorbidity(data_truncated, id = "id", code = "code", score = "elixhauser_icd10")
elix_full
elix_trunc

Visualise

Full codes

elix_full %>% 
  ggplot(aes(score)) +
  geom_bar()

Truncated

elix_trunc %>% 
  ggplot(aes(score)) +
  geom_bar()

Again, the same pattern: more scores of 0 and 1 and fewer higher scores. And again, the change is very mild.
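
Comparing two separate bar charts by eye is a bit fiddly, so a side-by-side table of counts can help. This is a minimal base R sketch using two made-up score vectors as stand-ins for elix_full$score and elix_trunc$score.

```r
# Toy stand-ins for elix_full$score and elix_trunc$score (values are made up)
score_full  <- c(0, 1, 1, 2, 3, 3)
score_trunc <- c(0, 0, 1, 1, 2, 3)

# Use a common set of levels so both rows line up column by column
lv <- sort(unique(c(score_full, score_trunc)))
counts <- rbind(
  full  = table(factor(score_full, levels = lv)),
  trunc = table(factor(score_trunc, levels = lv))
)
counts

# Row-wise proportions make any shift towards lower scores easier to read
props <- prop.table(counts, margin = 1)
props
```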

Weighted score full data

elix_full %>% 
  ggplot(aes(wscore)) +
  geom_bar()

Truncated

elix_trunc %>% 
  ggplot(aes(wscore)) +
  geom_bar()

Again, this holds up better than expected.

Of course, this is dummy data, and it is difficult to know how well this would hold up in the real world!

Session info

devtools::session_info()
## Session info -------------------------------------------------------------
##  setting  value                       
##  version  R version 3.5.1 (2018-07-02)
##  system   x86_64, mingw32             
##  ui       RTerm                       
##  language (EN)                        
##  collate  English_United Kingdom.1252 
##  tz       Europe/London               
##  date     2018-08-24
## Packages -----------------------------------------------------------------
##  package     * version date       source        
##  assertthat    0.2.0   2017-04-11 CRAN (R 3.5.0)
##  backports     1.1.2   2017-12-13 CRAN (R 3.5.0)
##  base        * 3.5.1   2018-07-02 local         
##  bindr         0.1.1   2018-03-13 CRAN (R 3.5.0)
##  bindrcpp    * 0.2.2   2018-03-29 CRAN (R 3.5.0)
##  broom         0.5.0   2018-07-17 CRAN (R 3.5.1)
##  cellranger    1.1.0   2016-07-27 CRAN (R 3.5.0)
##  checkmate     1.8.5   2017-10-24 CRAN (R 3.5.0)
##  cli           1.0.0   2017-11-05 CRAN (R 3.5.0)
##  codetools     0.2-15  2016-10-05 CRAN (R 3.5.1)
##  colorspace    1.3-2   2016-12-14 CRAN (R 3.5.0)
##  comorbidity * 0.1.1   2018-03-30 CRAN (R 3.5.1)
##  compiler      3.5.1   2018-07-02 local         
##  crayon        1.3.4   2017-09-16 CRAN (R 3.5.0)
##  datasets    * 3.5.1   2018-07-02 local         
##  devtools      1.13.6  2018-06-27 CRAN (R 3.5.1)
##  digest        0.6.16  2018-08-22 CRAN (R 3.5.1)
##  dplyr       * 0.7.6   2018-06-29 CRAN (R 3.5.1)
##  evaluate      0.11    2018-07-17 CRAN (R 3.5.1)
##  forcats     * 0.3.0   2018-02-19 CRAN (R 3.5.0)
##  ggplot2     * 3.0.0   2018-07-03 CRAN (R 3.5.1)
##  glue          1.3.0   2018-07-17 CRAN (R 3.5.1)
##  graphics    * 3.5.1   2018-07-02 local         
##  grDevices   * 3.5.1   2018-07-02 local         
##  grid          3.5.1   2018-07-02 local         
##  gtable        0.2.0   2016-02-26 CRAN (R 3.5.0)
##  haven         1.1.2   2018-06-27 CRAN (R 3.5.1)
##  hms           0.4.2   2018-03-10 CRAN (R 3.5.0)
##  htmltools     0.3.6   2017-04-28 CRAN (R 3.5.0)
##  httr          1.3.1   2017-08-20 CRAN (R 3.5.0)
##  jsonlite      1.5     2017-06-01 CRAN (R 3.5.0)
##  knitr         1.20    2018-02-20 CRAN (R 3.5.0)
##  labeling      0.3     2014-08-23 CRAN (R 3.5.0)
##  lattice       0.20-35 2017-03-25 CRAN (R 3.5.1)
##  lazyeval      0.2.1   2017-10-29 CRAN (R 3.5.0)
##  lubridate     1.7.4   2018-04-11 CRAN (R 3.5.0)
##  magrittr    * 1.5     2014-11-22 CRAN (R 3.5.0)
##  memoise       1.1.0   2017-04-21 CRAN (R 3.5.0)
##  methods     * 3.5.1   2018-07-02 local         
##  modelr        0.1.2   2018-05-11 CRAN (R 3.5.0)
##  munsell       0.5.0   2018-06-12 CRAN (R 3.5.0)
##  nlme          3.1-137 2018-04-07 CRAN (R 3.5.1)
##  parallel      3.5.1   2018-07-02 local         
##  pillar        1.3.0   2018-07-14 CRAN (R 3.5.1)
##  pkgconfig     2.0.2   2018-08-16 CRAN (R 3.5.1)
##  plyr          1.8.4   2016-06-08 CRAN (R 3.5.0)
##  purrr       * 0.2.5   2018-05-29 CRAN (R 3.5.0)
##  R6            2.2.2   2017-06-17 CRAN (R 3.5.0)
##  Rcpp          0.12.18 2018-07-23 CRAN (R 3.5.1)
##  readr       * 1.1.1   2017-05-16 CRAN (R 3.5.0)
##  readxl        1.1.0   2018-04-20 CRAN (R 3.5.0)
##  rlang         0.2.2   2018-08-16 CRAN (R 3.5.1)
##  rmarkdown     1.10    2018-06-11 CRAN (R 3.5.0)
##  rprojroot     1.3-2   2018-01-03 CRAN (R 3.5.0)
##  rstudioapi    0.7     2017-09-07 CRAN (R 3.5.0)
##  rvest         0.3.2   2016-06-17 CRAN (R 3.5.0)
##  scales        1.0.0   2018-08-09 CRAN (R 3.5.1)
##  stats       * 3.5.1   2018-07-02 local         
##  stringi       1.2.4   2018-07-20 CRAN (R 3.5.0)
##  stringr     * 1.3.1   2018-05-10 CRAN (R 3.5.0)
##  tibble      * 1.4.2   2018-01-22 CRAN (R 3.5.0)
##  tidyr       * 0.8.1   2018-05-18 CRAN (R 3.5.0)
##  tidyselect    0.2.4   2018-02-26 CRAN (R 3.5.0)
##  tidyverse   * 1.2.1   2017-11-14 CRAN (R 3.5.0)
##  tools         3.5.1   2018-07-02 local         
##  utils       * 3.5.1   2018-07-02 local         
##  withr         2.1.2   2018-03-15 CRAN (R 3.5.0)
##  xml2          1.2.0   2018-01-24 CRAN (R 3.5.0)
##  yaml          2.2.0   2018-07-25 CRAN (R 3.5.1)